We introduce the concept of random sequential renormalization (RSR) for arbitrary networks.
RSR is a graph renormalization procedure that locally aggregates nodes to produce a coarse-grained
network. It is analogous to the (quasi-)parallel renormalization schemes introduced by C. Song et
al. [C. Song et al., Nature (London) 433, 392 (2005)] and studied by F. Radicchi et al. [F. Radicchi
et al., Phys. Rev. Lett. 101, 148701 (2008)], but much simpler and easier to implement. Here
we apply RSR to critical trees and derive analytical results consistent with numerical simulations.
Critical trees exhibit three regimes in their evolution under RSR. (i) For $N_0^{\nu} \lesssim N < N_0$,
where $N$ is the number of nodes at some step in the renormalization and $N_0$ is the initial size of
the tree, RSR is described by a mean-field theory, and fluctuations from one realization to another
are small. The exponent $\nu = 1/2$ is derived using random walk and other arguments. The degree
distribution becomes broader under successive steps, reaching a power law $p_k \sim 1/k^{\gamma}$ with
$\gamma = 2$ and a variance that diverges as $N_0^{1/2}$ at the end of this regime. Both of these latter
results are obtained from a scaling theory. (ii) For $N_0^{\nu_{\rm star}} \lesssim N \lesssim N_0^{1/2}$,
with $\nu_{\rm star} \approx 1/4$, hubs develop, and fluctuations between different realizations of the
RSR are large. Trees are short and fat with an average radius that is $O(1)$. Crossover functions
exhibiting finite-size scaling in the critical region $N \sim N_0^{1/2} \to \infty$ connect the behaviors
in the first two regimes. (iii) For $N \lesssim N_0^{\nu_{\rm star}}$, star configurations appear with a
central hub surrounded by many leaves. The sizes of stars are broadly distributed over this range.
The scaling behaviors found under RSR are identified with a continuous transition in a process
called "agglomerative percolation" (AP), with the coarse-grained nodes in RSR corresponding to
clusters in AP that grow by simultaneously attaching to all their neighboring clusters.



PACS numbers: 02.70.Rr, 05.10.cc, 89.75.Hc, 89.75.Da 

I. INTRODUCTION 

Renormalization is a basic concept in statistical physics. It is a
process whereby degrees of freedom in a system are successively
eliminated by coarse graining. At the same time system parameters
are rescaled to compensate for the decimation, and the smallest
scale is reset to its original value [1]. Since a series of such
transformations is itself a transformation, the transformations
$\{\mathcal{R}\}$ form a semi-group: the "renormalization group" (RG).

If the system is statistically invariant under $\{\mathcal{R}\}$, one
speaks of RG invariance. An invariant system exhibits an asymptotic
fixed point under the RG flow with scaling described by homogeneous
functions. Prototypical RG fixed points are critical phenomena
displayed at continuous phase transitions as for the Ising model, by
athermal systems like directed [2] or ordinary [3] percolation,
relativistic quantum field theories [4], or the Feigenbaum (period
doubling) cascade in one-dimensional dynamical systems [5]. Systems
with the same fixed point under RG are in the same universality class
and share the same critical exponents.

It is natural to ask if similar concepts can be applied to 
glean meaningful information about complex networks. 
A positive answer was suggested in Ref. [5J and has 
stirred much interest. In the present paper we start an in- 
vestigation to further explore whether and in what sense 
this can be true. 



For models on a lattice, coarse graining can be accom- 
plished either in Fourier space or in real space. A typical 
real space RG proceeds heuristically by covering a spin 
lattice with a regular grid of boxes, and replacing the 
degrees of freedom in each box by a "super-spin" [3].
Interactions between spins in neighboring boxes are used
to specify the couplings between super-spins. 

However, many real world phenomena are better repre- 
sented as complex networks rather than regular lattices. 
Although research in this area has exploded in recent 
years (for reviews see, e.g., Refs. [ZHS]), our understand- 
ing of the statistical physics of complex networks has not 
caught up with the vast body of knowledge accrued over 
decades for lattice systems. Some phase transitions on 
networks (e.g., in the spreading of epidemics QZ31II]) are 
straightforward generalizations of critical phenomena on 
lattices. Yet it is not clear whether the RG, and real- 
space renormalization, in particular, can be applied sys- 
tematically to complex networks. 

Closely related to renormalization is the notion of fractal
dimensions [U 0]. Many complex networks are small-world networks
(121 113), where the number of nodes within reach of any node via
paths of length $r$ increases exponentially with $r$. Via any standard
definition, this gives infinite fractal dimensions. However, Song et
al. [B] made claims to the contrary, finding finite fractal dimensions
for several real-world networks based on a quasi-parallel
renormalization scheme. A real-space RG for networks that is not based
on the concept of fractal dimensions, but studied in terms of the flow
under renormalization, was proposed by Radicchi et al. Q3] US].

A fundamental issue pertinent to all the work up to 
now on renormalization of networks (see, for instance, 
Refs. [3 Q3H2D]) is that completely covering a network 
with equal size boxes leads to a number of unavoidable 
dilemmas that could lead to erroneous conclusions. Con- 
ceptually, covering the system with boxes of equal sizes is 
a flagrant violation of the original idea of Hausdorff [5T] , 
where the system ought to be covered with a partitioning 
whose elements have individually optimized sizes up to 
some largest size r. In most applications this is not a seri- 
ous impediment, and a covering with equal size elements 
gives equivalent results. Thus most estimates of fractal 
dimensions in physics use fixed box sizes, although there 
are well known cases where this leads to erroneous re- 
sults. The most famous one is given by any infinite but 
countable set of points, which according to Hausdorff, 
but not according to any covering algorithm with fixed 
box size, has zero dimension. 

One reason why this problem can be neglected in many 
physical systems is that the number of points per box (or, 
more precisely, the weight of each box) has small fluc- 
tuations, in particular, relative to a distribution whose 
width increases exponentially with box size. For small 
world networks, where, indeed, the maximum number of 
nodes increases exponentially, the schemes of Refs. [rjlUBl- 
120] may give misleading results because most boxes have 
only a few nodes. Then the problems associated with 
fixed box size become acute and there is no reason to 
believe that the results obtained are related to genuine 
fractal dimensions of the underlying graph. 

Even with fixed box size, the covering should also be 
optimized with respect to the exact placement or tiling 
of the boxes, which is an NP-hard problem [17]. Heuristic
methods for this optimization have been claimed to
work [HUB, 20], but as a matter of fact they depend on
the order in which boxes are laid down. Thus they are 
not true parallel substitutions of nodes by super-nodes, 
but quasi-parallel since the single step of tiling the whole 
network is implemented as a sequence of partial tilings. 
Combined with the problem of almost empty boxes, this 
means that the efficiency of the box covering algorithm 
changes both within each renormalization step (the boxes 
put down first contain in general more vertices than later 
boxes), and from one step to the next. 

Another problem with the (quasi-)parallel renormalization scheme is
that each step of renormalization dramatically reduces the number of
nodes in the network. Therefore fewer points and poorer statistics are
obtained for analyzing the renormalization flow. This becomes
particularly serious in the case of small-world networks, which
collapse to one node in a few steps, even when the initial network
size is huge. This has been overcome to some extent in Ref. [22] by
performing a renormalization where only parts of the network are
coarse-grained at each step, at the cost of adding more parameters and
making the results harder to interpret.

In view of these problems, we decided to study graph 
renormalization for unweighted, undirected networks by 
means of a purely sequential algorithm: At each step one 
node is selected at random, and all nodes within a fixed 
distance of it (including itself) are replaced by a single 
super-node. The super-node has links to all other nodes 
that were connected to the original subset absorbed into 
the super-node. This is repeated until the network col- 
lapses to a single node. 

Our method avoids the problem of finding an optimum 
tiling as well as problems with almost empty boxes. A 
further advantage of our random sequential renormal- 
ization (RSR) procedure is that each step has a much 
smaller effect on the network, and thus the whole renor- 
malization flow consists of many more single steps for a 
finite system and allows for a more fine grained analysis. 

If there are fixed points underlying this RG flow, then 
they will manifest themselves in terms of (finite-size) scal- 
ing laws, which hold for large initial networks at inter- 
mediate times. Here time is measured by the number of 
steps in the RSR. At intermediate times, the system is 
far from both the initial network and the non-invariant 
final network composed of a single super node. 

On any graph, including networks or lattices, the 
super-nodes can be viewed as clusters that grow by at- 
taching to all of their neighboring clusters, up to a dis- 
tance b in the network of clusters. This process, called 
"agglomerative percolation", has been solved exactly in 
one dimension and shown to exhibit scaling laws with 
exponents that depend on b [53]. On a square lattice 
in two dimensions, critical behavior is seen which is in a 
different universality class [24] than ordinary percolation.
Thus the scaling behavior seen in RSR occurs as a result 
of a type of percolation transition and is not restricted 
to cases where the underlying graph is fractal. 

Here we apply our RSR methodology to critical trees and also find
evidence for a critical point (which is, however, not a fixed point of
the RSR!) where the number of links attached to any node (i.e., its
degree) follows a power law and divergences appear for, e.g., the
variance of the degree distribution. The size of the networks at the
transition point diverges as $N_0^{1/2}$, slower than the initial
network size ($N_0$) in the limit of infinite system size. Below this
transition, renormalized trees are short and fat with an average depth
(or radius) which is $O(1)$. We determine some critical exponents
using random walk and other arguments, as well as a mean-field theory
for the initial, uncorrelated phase. We use, in addition, the
observation that all renormalized networks for $b = 1$ eventually
reach a star dominated by a central hub before they collapse to a
single node. Our results are confirmed by means of finite-size scaling
analyses of results from numerical simulations. These simulations also
reveal scaling behavior for the probability distribution for the sizes
of networks that first reach a star configuration. This turns out to
be equivalent to the distribution of sizes one step before the network
collapses to a single node. Stars first appear for renormalized
networks when the size of the network is
$N \lesssim N_0^{\nu_{\rm star}}$ with $\nu_{\rm star} \approx 1/4$.

FIG. 1: (Color online) One step of RSR with $b = 1$. The randomly
chosen target node (red circle) absorbs all its nearest neighbors
(blue stars). All links to the absorbed nodes (from green triangular
nodes) are then redirected to the target. Alternatively one can view
the super-node as a cluster (bounded by the red curve) that
subsequently grows by invading its neighboring clusters.

In Sec. II, we define the general RSR procedure for any 
network as well as the specific ensemble of networks we 
analyze in this paper. Section III presents our theoretical 
and numerical results for RSR of critical trees. Finally, 
we end with conclusions and outlook for future work in 
Sec. IV. 



II. THE MODEL

A. Random Sequential Renormalization

For any undirected, unweighted graph, RSR with radius $b$
($b = 1, 2, \ldots$) is defined as follows: Starting with a graph with
$N_0$ nodes, we produce a sequence of graphs of strictly decreasing
sizes $N_t$ with $0 \le t \le T$ and $N_T = 1$. For each step
$t \to t + 1$ ($t$ is called "time" in the following):

(i) We choose randomly a target node $i \in [1, \ldots, N_t]$.

(ii) We delete all nodes that can be reached from $i$ by at least one
path of length $1 \le \ell \le b$.

(iii) We also delete all links between these chosen nodes, and all
links connecting them to $i$.

(iv) Each link connecting any node outside this neighborhood to a
deleted node is redirected towards the target.

(v) If this creates a multiple link between any two nodes, it is
replaced by a single link.

Hence the target node $i$ is replaced by a super-node that maintains
all links to the outside. Its internal features, however, are erased
from the network, consistent with coarse graining. Figure 1 shows an
example of one step of RSR for $b = 1$. After absorbing its neighbors
the super-node is treated like any other node and the process repeats
until the network collapses into a single node. One could also vary
the probability of choosing a target node by a function of its mass
(the number of nodes absorbed into it), or its degree (the number of
links attached to it), but these aspects are not explored here.

When $b = 1$, only nearest neighbors of the target node are deleted.
For $b > 1$ each step can be implemented by performing $b$ successive
decimations with radius one on the same target. Although this method
is slightly slower than an optimal coding where all nodes within
distance $\le b$ of the target are found and deleted in a single step,
it reduces code complexity and potential sources of errors.

For any radius $b \ge 1$, RSR exhibits two trivial fixed points: a
graph consisting of a single node, and an infinitely long chain. For a
long but finite chain, the time until a single node is reached is
$T = [N_0/2b]$. In one dimension, the exact probability to find any
consecutive sequence of node masses for any $N$ and at any time has
been determined [53]. At late times, and for large $N_0$, the mass
distribution of the nodes exhibits scaling both at small and large
sizes with (different) exponents that depend on $b$. For $b = 1$
another fixed point exists, which is a star with infinitely many
leaves. In that limit, the probability to choose the central hub of
the star as the target vanishes. With probability one, a single leaf
is removed during each RSR step. For a finite number
$N_{\rm star} - 1$ of leaves, a star has an average lifetime
$T = O(N_{\rm star})$ before it collapses into a single node. Notice
that simple stars are not fixed points for $b > 1$, as any star
reduces to a single node in one step with probability one. In this
paper we study only the case of RSR with $b = 1$.
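The decimation rules (i)-(v) of Sec. II A can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the simulations reported here: the adjacency-set representation and the names `absorb_neighbours`, `rsr_step`, and `rsr_sizes` are assumptions made for this example, and the graph is assumed to be connected. A radius $b > 1$ is handled, as described in the text, by $b$ successive radius-one decimations on the same target.

```python
import random

def absorb_neighbours(adj, target):
    """Radius-one decimation: merge all neighbours of `target` into it.

    `adj` maps node -> set of neighbours (undirected, no self-loops) and
    is modified in place.
    """
    absorbed = set(adj[target])            # step (ii) with b = 1
    outside = set()
    for v in absorbed:
        outside |= adj[v]                  # everything the absorbed nodes touch
    outside -= absorbed                    # step (iii): internal links vanish
    outside.discard(target)
    for v in absorbed:
        for w in adj[v]:
            if w in outside:
                adj[w].discard(v)          # step (iv): detach from deleted node
        del adj[v]
    adj[target] = outside                  # step (v): a set keeps links single
    for w in outside:
        adj[w].add(target)                 # step (iv): redirect to the target

def rsr_step(adj, b=1, rng=random):
    """One RSR step: pick a uniform random target, decimate b times."""
    target = rng.choice(list(adj))         # step (i)
    for _ in range(b):
        absorb_neighbours(adj, target)
    return target

def rsr_sizes(adj, b=1, rng=random):
    """Run RSR on a connected graph until one node is left; return all N_t."""
    sizes = [len(adj)]
    while len(adj) > 1:
        rsr_step(adj, b, rng)
        sizes.append(len(adj))
    return sizes
```

Using sets for the neighbourhoods makes rule (v) automatic, at the price of copying the node list once per step when the target is drawn; a production code would presumably keep an indexable node array instead.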



B. Initial graph ensemble 

The ensemble of critical trees is generated as follows: Starting with
a single node, each node can have 0, 1, or 2 offspring with
probabilities 1/4, 1/2, and 1/4. (Hence the mean number of offspring
is 1.) The process runs until it dies due to fluctuations. The sizes
of trees obtained in this way are distributed according to an inverse
power law $P(N_0) \sim N_0^{-3/2}$ [3]. From these we pick a large
($\approx 10^2 - 10^3$) ensemble of trees with the desired (large)
$N_0$ ($\pm 10\%$), and discard all others. Note that simply
truncating trees that survive up to $N_0$ would give a biased sampling
of the ensemble.
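The sampling procedure just described can be sketched as follows. The parent-list representation, the function names, and the `tol` parameter are assumptions of this illustration. Trees that outgrow the accepted size window are discarded as a whole (never truncated), which is the point of the remark about biased sampling.

```python
import random

def critical_tree(rng=random, max_size=None):
    """Grow one critical tree with 0, 1, 2 offspring w.p. 1/4, 1/2, 1/4.

    Returns a parent list (parent[0] == -1 marks the root), or None as
    soon as the tree exceeds `max_size`, so that it can be discarded.
    """
    parent = [-1]
    frontier = [0]                              # nodes whose offspring are still undrawn
    while frontier:
        node = frontier.pop()
        n_children = rng.choice((0, 1, 1, 2))   # probabilities 1/4, 1/2, 1/4
        for _ in range(n_children):
            parent.append(node)
            frontier.append(len(parent) - 1)
        if max_size is not None and len(parent) > max_size:
            return None                         # too large: reject, do not truncate
    return parent

def sample_tree(n0, tol=0.1, rng=random):
    """Rejection-sample one tree whose size lies within n0 * (1 +/- tol)."""
    lo, hi = (1 - tol) * n0, (1 + tol) * n0
    while True:
        tree = critical_tree(rng, max_size=int(hi))
        if tree is not None and lo <= len(tree) <= hi:
            return tree
```

Because $P(N_0) \sim N_0^{-3/2}$, the acceptance rate of this naive rejection loop falls off with the target size, so it is only practical for moderate $N_0$.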

This construction generates a rooted tree, with important consequences
for joint degree distributions of adjacent nodes. The direction of
growth leaves its imprint on them. For ordinary undirected random
graphs (Erdős-Rényi graphs), it is well known that the degree
distribution for pairs of nodes obtained by randomly choosing a link
is different from that obtained by choosing any two nodes at random.
If the degree distribution is $p_k$, the distribution of degree pairs
for linked nodes is not $p_k p_{k'}$, but
$k k' p_k p_{k'} / \langle k \rangle^2$, because higher degree nodes
have a greater chance of being attached to a randomly chosen link. For
the present model, two connected nodes are always in a mother-daughter
relationship. In particular, all nodes have in-degree one; that is,
they have one mother (except for the root). If $k$ is the out-degree
of the mother and $k'$ the out-degree of the daughter, then the
distribution of degree pairs obtained by randomly choosing links is

$$\frac{k\, p_k p_{k'}}{\sum_{l,l'} l\, p_l p_{l'}} = \frac{k\, p_k p_{k'}}{\langle k \rangle} . \tag{1}$$

While high degree mothers have a greater chance of appearing in a pair
than low degree mothers, no such bias holds for daughters. Otherwise
said, if we pick a random node, the out-degrees of its daughters will
be distributed according to $p_{k'}$, while the out-degree of its
mother is distributed $\propto k p_k$. Notice that this implies that
our ensemble of critical trees is not equivalent to the ensemble of
critical Erdős-Rényi graphs.

In the following, we shall always denote by $p_k$ the distribution of
out-degrees, and we will, for simplicity, always call $k$ the "degree"
(even though the real degree is $k + 1$).



III. ANALYTICAL CALCULATIONS AND SIMULATION RESULTS

A. Evolution of the tree size, N

Let $n_k$ be the number of nodes with degree $k$, and
$N = \sum_k n_k$ the total number of nodes in the tree (i.e., its
size, at a given step). Both $N$ and $n_k$ are fluctuating functions
of time $t$. Since target nodes are picked randomly, the average
degree of the target is
$\langle k \rangle = N^{-1} \sum_k k\, n_k = 1 - 1/N$, where the last
equality follows from the fact that the total number of links in a
tree is always $N - 1$. Since all the target's neighbors (both its
mother, unless it is the root, and any daughters) are deleted in the
subsequent renormalization step, we get the exact result

$$\frac{\overline{\Delta N}}{\Delta t} = -\langle k \rangle - 1 + \frac{1}{N} = -2 + \frac{2}{N} . \tag{2}$$

Here the overline denotes an average over the randomness of the last
step only, while brackets denote ensemble averages (except for
$\langle k \rangle$) including also the randomness from previous RSR
steps. Approximating $t$ by a continuous variable and performing such
an ensemble average gives

$$\langle N \rangle = N_0 - 2t + \ln\left(\frac{N_0 - 1}{\langle N \rangle - 1}\right) . \tag{3}$$

(The integration can only be performed for $N > 1$.) We have replaced
$\langle 1/N \rangle$ on the right hand side of Eq. (2) by
$1/\langle N \rangle$, which is a mean-field approximation. We show in
Sec. III E that this mean-field regime extends up to a time when
$N \sim O(N_0^{1/2})$.

B. Evolution of the degree distribution

The probability that a randomly chosen node in a network has degree
$k$ is $p_k = n_k/N$. The change of $n_k$ in one step of
renormalization has three contributions,

$$\frac{\overline{\Delta n_k}}{\Delta t} = r_k + s_k + q_k , \tag{4}$$

where:

• $r_k$ is a loss term associated with the possibility that the target
had (old) degree $k$ before the considered renormalization step. It is

$$r_k = -p_k . \tag{5}$$

• $s_k$ is a loss term from the (old) neighbors of the target having
degree $k$. Assuming no degree correlations, which is also a
mean-field approximation, and summing over all (old) degrees $k'$ of
the target gives

$$s_k = -\sum_{k'} k' p_{k'}\, p_k - \sum_{k'} p_{k'} \frac{k p_k}{\langle k \rangle}
      = -\langle k \rangle p_k - \frac{k p_k}{\langle k \rangle}
      \approx -(1 + k)\, p_k . \tag{6}$$

Here the first term is the contribution of the daughters, while the
second is due to the mother. This assumes that the target is not the
root. For simplicity we shall neglect that possibility in the
following, which makes errors of $O(1/N)$. These are negligible for
large $N$. The last line follows from
$\langle k \rangle = 1 - 1/N \approx 1$, which is a good approximation
for the same reason.

• $q_k$ is a gain term arising from the possibility that the target
acquires new degree $k$. Assume that the old degree of the target was
$m$, that the degrees of its daughters were $k_1, \ldots, k_m$, and
that the degree of its mother was $k_0$, and that all degrees are
uncorrelated. Then

$$q_k = \sum_m p_m \sum_{k_0, k_1, \ldots, k_m} \frac{k_0\, p_{k_0}}{\langle k \rangle}\, p_{k_1} \cdots p_{k_m}\, \delta_{k_0 + k_1 + \ldots + k_m - 1,\, k} . \tag{7}$$

This term is not very transparent. For a more tractable formulation we
use the generating function methods discussed next.
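Equation (2) of Sec. III A is exact for any tree, which is easy to confirm by enumerating every possible target. The sketch below does this for a tree stored as a parent list (`parent[0] == -1` for the root); this representation and the function name are assumptions of the example, not part of the original analysis.

```python
def mean_size_change(parent):
    """Average of Delta N over all possible targets of one b = 1 RSR step.

    Choosing a target removes its mother (if it has one) and all of its
    daughters, so Delta N = -(number of daughters) - (1 if not root).
    """
    n = len(parent)
    daughters = [0] * n
    for mother in parent:
        if mother >= 0:
            daughters[mother] += 1
    total = 0
    for target in range(n):
        has_mother = 1 if parent[target] >= 0 else 0
        total -= daughters[target] + has_mother
    return total / n

# A small hand-built tree with N = 6 nodes.
example = [-1, 0, 0, 1, 1, 4]
print(mean_size_change(example), -2 + 2 / len(example))
```

Both printed numbers agree because the sum of all (undirected) degrees in a tree is $2(N - 1)$, independent of its shape.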
C. Generating Functions

The generating function for $p_k$ is

$$G(x) = \sum_k p_k x^k , \tag{8}$$

and moments of the distribution are given by

$$\langle k^m \rangle = \left[ \left( x \frac{d}{dx} \right)^m G(x) \right]_{x=1} . \tag{9}$$

Similarly, the generating function for the gain term is

$$Q(x) = \sum_k q_k x^k . \tag{10}$$

If a variable has a given generating function, then the generating
function for the sum of that variable over $m$ independent
realizations is given by the $m$th power of that generating function
[25]. Hence, if the target node has degree $m$, the generating
function for the sum of degrees of all its daughters is $[G(x)]^m$.
Using the above definitions and
$G'(1) = \langle k \rangle \approx 1$, we get

$$Q(x) = \sum_m p_m G'(x)\, G^m(x) = G'(x)\, G(G(x)) . \tag{11}$$

This, together with Eqs. (4) through (6), leads to

$$\frac{\Delta G(x)}{\Delta t} = \frac{1}{N} \left[ G'(x)\, G(G(x)) - x\, G'(x) \right] + O(1/N^2) . \tag{12}$$

A more tedious calculation, which requires generating functions for
the root of the tree, yields the neglected $O(1/N^2)$ terms. The exact
result (assuming no correlations) is

$$\frac{\Delta G(x)}{\Delta t} = \frac{1}{N} \left[ G'(x)\, G(G(x)) - x\, G'(x) \right] + \frac{1}{N^2} \left[ G(G(x)) - G(x) \right] . \tag{13}$$

One checks easily that this satisfies the conditions that $G(1)$ is
constant and $G'(1) = 1 - 1/N$ for all $t$.
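As a concrete illustration of Eqs. (7) and (11), the gain distribution $q_k$ can be computed exactly for the initial offspring distribution $p = (1/4, 1/2, 1/4)$ by treating generating functions as coefficient lists. The helper names below are assumptions of this sketch. Exact rational arithmetic is used so that the normalization $Q(1) = 1$ and the mean new degree $Q'(1) = G''(1) + 1 = 3/2$ can be checked without rounding.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def gain_distribution(p):
    """Coefficients q_k of Q(x) = G'(x) G(G(x)) / <k> for out-degrees p[k]."""
    mean_k = sum(k * pk for k, pk in enumerate(p))
    # Mother contributes k0 - 1 links, weighted by k0 p_{k0} / <k>.
    mother = [k * p[k] / mean_k for k in range(1, len(p))]
    q = []
    daughters = [Fraction(1)]                 # G(x)**0
    for pm in p:                              # target had old degree m
        term = poly_mul(mother, daughters)
        q += [Fraction(0)] * (len(term) - len(q))
        for k, tk in enumerate(term):
            q[k] += pm * tk
        daughters = poly_mul(daughters, p)    # next power G(x)**(m + 1)
    return q

p0 = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
q0 = gain_distribution(p0)
print(sum(q0), sum(k * qk for k, qk in enumerate(q0)))
```

For this distribution $\langle k \rangle = 1$ exactly, so dividing by $\langle k \rangle$ changes nothing and the result coincides with Eq. (11) as written.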



D. Variance of the degree distribution 



Obtaining the time evolution of the variance of the degree
distribution requires an expression for the time evolution of the
second derivative of $G$. From Eq. (12) it follows that

$$\frac{\Delta G''(1)}{\Delta t} = \frac{2 G''(1)}{N} + O(1/N^2) . \tag{14}$$

Making the same steps and approximations as in subsection III A gives

$$\frac{d G''(1)}{dt} = \frac{d \langle k^2 - k \rangle}{dt}
  = \frac{2 \langle k^2 - k \rangle}{\langle N \rangle}
  = \frac{2 \langle k^2 - k \rangle}{N_0 - 2t} + O(1/\langle N \rangle^2) . \tag{15}$$

Integrating, fixing the integration constant by the condition
$\langle k^2 \rangle_0 = 3/2 + O(1/N_0)$, and rewriting the result in
terms of the variance of the degree distribution $\sigma^2$ gives

$$\sigma^2 \equiv \langle k^2 \rangle - \langle k \rangle^2 = \frac{N_0}{2 (N_0 - 2t)} \approx \frac{N_0}{2N} . \tag{16}$$

FIG. 2: (Color online) Comparison between the variance of the degree
distribution obtained from Eq. (16) and simulations for different
system sizes, $N_0$. The mean-field theory extends over a larger range
for increasing $N_0$. The inset shows that the maximum variance in RSR
observed numerically scales as $N_0^{1/2}$, in agreement with our
scaling ansatz Eq. (17).

In Fig. 2 we compare Eq. (16) for the variance of the degree
distribution with numerical simulations of RSR for different initial
sizes of critical trees. We see perfect agreement at early times, but
increasingly larger disagreement at later times. This is only in part
due to the neglected higher order terms in $1/N$. Another source of
error at late times is that $N$ exhibits large fluctuations compared
to its average. Also, degree correlations develop. Hence, the
mean-field approximation breaks down for large $t$. But we also see
from Fig. 2 that agreement between theory and numerical results
extends over a broader range for increasing system size $N_0$.
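For readers who want the intermediate steps, the passage from Eq. (12) to Eqs. (14) and (16) can be spelled out as below. This is a sketch under the same mean-field assumptions as in the text ($G(1) = 1$, $G'(1) \simeq 1$, $\langle N \rangle \simeq N_0 - 2t$); the shorthands $F(x)$, $H(x)$, and $y(t)$ are introduced here only for illustration.

```latex
% Shorthand (not in the original text):
%   F(x) is the bracket of Eq. (12), H(x) = G(G(x)), y(t) = <k^2 - k> = G''(1).
\begin{align*}
F(x)   &= G'(x)\,H(x) - x\,G'(x), \qquad H(x) = G(G(x)), \\
F''(x) &= G'''(x)H(x) + 2G''(x)H'(x) + G'(x)H''(x) - 2G''(x) - x\,G'''(x), \\
H(1)   &= 1, \quad H'(1) = G'(1)^2 \simeq 1, \quad
          H''(1) = G''(1)G'(1)^2 + G'(1)G''(1) \simeq 2G''(1), \\
F''(1) &\simeq G'''(1) + 2G''(1) + 2G''(1) - 2G''(1) - G'''(1) = 2G''(1)
          \quad\Longrightarrow\quad \text{Eq. (14)}, \\[4pt]
\frac{dy}{dt} &= \frac{2y}{N_0 - 2t}, \qquad
          y(0) = \langle k^2\rangle_0 - \langle k\rangle_0 = \tfrac32 - 1 = \tfrac12
          \quad\Longrightarrow\quad y(t) = \frac{N_0}{2\,(N_0 - 2t)}, \\
\sigma^2 &= y + \langle k\rangle - \langle k\rangle^2 \simeq y
          = \frac{N_0}{2\,(N_0 - 2t)} \quad\Longrightarrow\quad \text{Eq. (16)}.
\end{align*}
```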

To understand better the behavior at late times (small $N/N_0$), we
replot the same data using a finite-size scaling (FSS) method in
Fig. 3. This plot demonstrates that the scaling ansatz

$$\sigma^2 = \frac{N_0}{N}\, g\!\left( \frac{N}{N_0^{\nu}} \right) \tag{17}$$

with scaling exponent $\nu = 1/2$ gives excellent data collapse. We
derive the result $\nu = 1/2$ in the next subsection. The scaling
function $g(x)$ satisfies $g(x) \to 1/2$ for $x \to \infty$, in
agreement with Eq. (16). In addition, the network must, by definition,
end up as a star before it collapses. Assuming that the star consists
of a central hub surrounded by low degree nodes (which is verified
numerically), its variance will scale with its size as
$\sigma^2 \sim N$. Also, the variance of the degree distribution of
the star must be independent of the initial size $N_0$. These
considerations lead to the conclusion that $g(x) \to x^2$ as
$x \to 0$. Finally, in the scaling ansatz, $g$ and its derivative are
continuous functions. As a result the maximum variance occurs when
$N \sim N_0^{1/2}$, so that the maximum value of
$\sigma^2 \sim N_0^{1/2}$, in agreement with the inset of Fig. 2.
Scaling laws like Eq. (17) in terms of homogeneous functions are well
known from critical phenomena [3I1], where they describe finite-size
scaling with several control parameters such as temperature and
magnetic field.

FIG. 3: (Color online) Scaling of the variance of the degree
distribution obtained from RSR. The data are the same as in Fig. 2 but
the axes are different. They are chosen according to the scaling
ansatz Eq. (17), and give excellent data collapse. The straight line
has slope $m = 2$.

E. Fluctuations of the system size and the relaxation time

In this subsection we derive the result $\nu = 1/2$ by considering
fluctuations around the average value of $\Delta N/\Delta t$, and the
resulting fluctuations both of $N_t$ and of the relaxation time $T$.
(Recall that the latter is defined as the time when the tree is first
reduced to a single node.) Here we explicitly label the fluctuating
number of nodes with its time dependence $N_t$.

Generalizing Eq. (2) and neglecting the $O(1/N)$ term, we make the
ansatz

$$\frac{\Delta N_t}{\Delta t} = -2 + \epsilon_t . \tag{18}$$

Here $\epsilon_t$ is a random variable with zero mean and with
variance equal to the variance of the degree distribution
$\sigma_t^2$, which, on average, increases with time $t$. Assuming no
degree correlations, the random variables $\epsilon_t$ at different
times are also uncorrelated, and

$$\langle \epsilon_t \epsilon_{t'} \rangle = \delta_{t,t'}\, \sigma_t^2 . \tag{19}$$

Thus the fluctuations of $N_t$ are given by

$$\delta N_t = N_t - \langle N_t \rangle = \sum_{t'=0}^{t-1} \epsilon_{t'} . \tag{20}$$

Since $\sigma_t$ is finite for all $t$, the central limit theorem
implies that $\delta N_t$ is Gaussian for large $t$ with variance

$${\rm Var}[\delta N_t] = \sum_{t'=0}^{t-1} \sigma_{t'}^2 \approx \frac{N_0}{4} \ln \frac{N_0}{\langle N_t \rangle} . \tag{21}$$



This estimate has to break down when typical fluctuations of $N_t$ are
as big as its average, or when
${\rm Var}[\delta N_t] \approx \langle N_t \rangle^2$. We claim that
this happens at a time when $\langle N_t \rangle \sim N_0^{1/2}$,
explaining the fact that $\nu = 1/2$. Indeed, when
$\langle N_t \rangle \sim N_0^{\nu}$ with some positive exponent
$\nu$, then ${\rm Var}[\delta N_t] \sim N_0 \ln N_0 > N_0$ for large
$N_0$, implying that it is larger than $\langle N_t \rangle^2$ for any
$\nu < 1/2$. On the other hand, ${\rm Var}[\delta N_t]$ increases less
quickly than $\langle N_t \rangle^2$ for any $\nu > 1/2$, showing that
the initial scaling regime breaks down when
$\langle N_t \rangle \sim N_0^{\nu}$ with $\nu = 1/2$.

Fluctuations of the relaxation time $T$ are obtained by demanding that
$N_T = 1$, which gives

$$2T - \sum_{t'=0}^{T-1} \epsilon_{t'} \approx N_0 . \tag{22}$$

Hence, for large $N_0$, $T$ is distributed as an inverse Gaussian
variate, which is well approximated in the large $N_0$ limit by an
ordinary Gaussian. Strictly, its variance cannot be calculated
exactly, since the summation extends beyond the limit of applicability
of our theory. To take this into account, we first convert the
summation over $t'$ to an integral over $N$ and truncate the integral
at $N_0^{1/2}$, where the mean-field theory breaks down. Integration
gives

$${\rm Var}[\delta T] = \frac{1}{32} N_0 \ln N_0 \tag{23}$$

plus lower order terms. This is compared with the simulation results
shown in the inset of Fig. 4, finding good agreement.

FIG. 4: (Color online) Distributions of relaxation times for various
values of $N_0$. The inset compares the variance of these
distributions to Eq. (23), finding good agreement.

FIG. 5: (Color online) FSS analysis of $k_{\max}$ using Eq. (24).
There is perfect data collapse in the region $N \sim N_0^{1/2}$.
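The sums behind Eqs. (21) and (23) are simple enough to evaluate directly, which gives a quick check of the closed forms. The sketch below assumes the mean-field expressions $\sigma_t^2 = N_0/[2(N_0 - 2t)]$ from Eq. (16) and $\langle N_t \rangle = N_0 - 2t$; the function names and the exact cutoff rule (stopping where $\langle N_t \rangle = N_0^{1/2}$) are choices made for this illustration.

```python
import math

def sigma2_mean_field(n0, t):
    """Eq. (16): mean-field variance of the degree distribution at time t."""
    return n0 / (2.0 * (n0 - 2 * t))

def var_size_sum(n0, t):
    """Discrete sum over t' = 0 .. t-1 appearing in Eq. (21)."""
    return sum(sigma2_mean_field(n0, s) for s in range(t))

def var_size_closed(n0, t):
    """Closed form of Eq. (21) with <N_t> = N0 - 2t."""
    return 0.25 * n0 * math.log(n0 / (n0 - 2 * t))

def var_relaxation_time(n0):
    """Var[T] = (1/4) * sum of sigma^2, truncated where <N_t> = N0**0.5."""
    t_cut = int((n0 - math.sqrt(n0)) / 2)
    return 0.25 * var_size_sum(n0, t_cut)

n0 = 10**4
t_cut = int((n0 - math.sqrt(n0)) / 2)
print(var_size_sum(n0, t_cut), var_size_closed(n0, t_cut))
print(var_relaxation_time(n0), n0 * math.log(n0) / 32)
```

The factor 1/4 in `var_relaxation_time` comes from solving Eq. (22) for $T$, which halves the accumulated noise; the small residual difference between sum and closed form is the kind of lower order term mentioned in the text.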



F. Scaling of maximum degree 

A simple way to track the formation of hubs under RSR is to measure
the maximum degree in the network, $k_{\max}$. A naive scaling
assumption is that when a few large hubs together with many low degree
nodes dominate, $\sigma^2 \sim k_{\max}^2/N$. Using Eq. (17) gives

$$k_{\max} = N_0^{1/2}\, f\!\left( \frac{N}{N_0^{1/2}} \right) . \tag{24}$$

Figure 5 compares this equation to results from numerical simulations.
While there are clear (and expected) deviations for
$N/N_0^{1/2} \to \infty$, the collapse in the intermediate region
$N \sim N_0^{1/2}$, where $\sigma^2$ achieves its maximum, is perfect.
As before, assuming that the tree evolves to a star with a hub at its
center suggests that $f(x) \sim x$ as $x \to 0$. However, in Fig. 5 we
do not observe this behavior as the fitting region is small and there
is still some curvature in the scaling function. As for $\sigma^2$,
our theory predicts that the largest value of $k_{\max}$ observed
under RSR scales as $N_0^{1/2}$ and agrees with the data seen in the
inset of Fig. 5.



G. Ratio of the largest degree to the second largest 
degree 



The ratio of $k_{\max}$ to the second largest degree $k_{\max,2}$
(provided that $k_{\max,2} > 0$) is shown in Fig. 6. It agrees with an
FSS analysis using the same exponent $\nu = 1/2$,

$$\frac{k_{\max}}{k_{\max,2}} = h\!\left( \frac{N}{N_0^{1/2}} \right) . \tag{25}$$

Once again the extreme limits of the scaling function $h$ can be
determined. For the initial network the largest and second largest
degree are equal, so $h(x \to \infty) \to 1$. For a pure star of size
$N$, $k_{\max}/k_{\max,2} = N$. As shown in Section III J, stars first
appear when $N \sim N_0^{\nu_{\rm star}}$ with
$\nu_{\rm star} \approx 1/4$. In that case
$k_{\max}/k_{\max,2} \sim N_0^{1/4}$. Hence
$h(N_0^{-1/4}) \sim N_0^{1/4}$, or $h(x \to 0) \sim 1/x$. Figure 6
shows that $h$ is increasing in this limit, although the asymptotic
regime is not yet reached for the system sizes studied.

FIG. 6: (Color online) FSS analysis of $k_{\max}/k_{\max,2}$. This
ratio increases as one large hub separates from the rest of the degree
distribution. The data show good agreement with the scaling ansatz
Eq. (25). The line with slope $-1$ indicates the theoretical
prediction as the network approaches a star.

FIG. 7: ... $N = 2N_0^{1/2}$ and $N = 0.5N_0^{1/2}$. These
distributions are obtained by averaging over different initial
networks and different realizations of RSR. The distribution widens
and then becomes more narrow on decreasing $N$ as hubs separate from
the rest of the nodes during the transition. The data are consistent
with our theoretical prediction that at the critical point
$p_k \sim k^{-\gamma}$ with $\gamma = 2$.



H. Degree distribution 

Degree distributions for large initial trees at three points in the
evolution are shown in Fig. 7. Critical trees start with a narrow
degree distribution, which becomes broader and broader under RSR. The
degree distribution gradually transforms into a power law distribution
as $N$ approaches $\sim N_0^{1/2}$. For a power law degree
distribution $p(k) \sim k^{-\gamma}$, the variance obeys

$$\sigma^2 \sim k_{\max}^{3 - \gamma} . \tag{26}$$

From the scaling result at the transition,
$\sigma^2 \sim k_{\max} \sim N_0^{1/2}$, we get $\gamma = 2$,
consistent with the data shown.
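The link between the exponent $\gamma$ and the growth of the variance with the cutoff $k_{\max}$ can be checked numerically for a pure truncated power law. The sketch below is an idealization assumed for illustration (a distribution $p_k \propto k^{-\gamma}$ on $1 \le k \le k_{\max}$, with no hub bump); the function names are hypothetical. It estimates the local exponent of $\sigma^2$ versus $k_{\max}$ by doubling the cutoff, which for $\gamma = 2$ should be close to 1.

```python
import math

def truncated_power_law_variance(k_max, gamma):
    """Variance of p_k ~ k**(-gamma) restricted to 1 <= k <= k_max."""
    ks = range(1, k_max + 1)
    weights = [k ** (-gamma) for k in ks]
    norm = sum(weights)
    m1 = sum(k * w for k, w in zip(ks, weights)) / norm
    m2 = sum(k * k * w for k, w in zip(ks, weights)) / norm
    return m2 - m1 * m1

def local_exponent(k_max, gamma):
    """Estimate d ln(variance) / d ln(k_max) by doubling the cutoff."""
    v1 = truncated_power_law_variance(k_max, gamma)
    v2 = truncated_power_law_variance(2 * k_max, gamma)
    return math.log(v2 / v1, 2)

print(local_exponent(1000, 2.0))   # close to 3 - gamma = 1
print(local_exponent(1000, 2.5))   # close to 3 - gamma = 0.5
```

At finite cutoff the estimates deviate slightly from $3 - \gamma$ because the squared mean contributes a slowly varying correction.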



With the formation of a giant hub at the transition, a bump appears at
large $k$ in $p_k$. This is clearly visible for $N = 0.5N_0^{1/2}$ in
Fig. 7. Note that the distributions shown in this figure are obtained
by averaging over many initial networks and many realizations of RSR.
In the degree distribution of a single network a gap emerges between
the largest hub and the rest of the nodes, for
$N \lesssim N_0^{1/2}$, as demonstrated in Fig. 6.
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Agreement with Eq. 
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indicated. Fluctuations cannot be ignored for small N/N^ 2 
when mean-field theory breaks down and the bounds are no 
longer valid. 
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1/2 

shell at $N = N_0^{1/2}$. The other shells vanish increasingly faster
as $N$ decreases further.



I. Mean-field theory for average radius of trees 



The sum of the distances of nodes from the root in a tree of size $N$ can be written as

$$R = \sum_{x=1}^{N-1} g_x, \tag{27}$$

where $g_x$ is the distance of node $x$ from the root. It is simplest to consider that (except for the root) the mother of a target node absorbs her (target) daughter plus all of that daughter's daughters. Consider node $x$ at distance $g_x > 1$. If the root is the target in the next RSR step, $g_x$ is reduced by 1. If an ancestor of $x$'s mother is hit, which is not the root, then $g_x$ is reduced by 2. If either $x$ or her mother is the target, then $x$ disappears, contributing zero to $R$. Hence the position of $x$ evolves in the continuous time approximation on average as

$$N\frac{dg_x}{dt} = -1 - 2(g_x - 2) - 2g_x \tag{28}$$

for $g_x > 1$. For $g_x = 1$, node $x$ disappears if either the root or $x$ itself is the target, so

$$N\frac{dg_x}{dt} = -2. \tag{29}$$



We can write the evolution in terms of the average number of nodes instead of time. As before, in mean-field we ignore fluctuations in $N$ about its average $\langle N \rangle$, in $R$ about its average $\langle R \rangle$, and in the number of nodes at distance 1 in the tree, $S_1$, about its average $\langle S_1 \rangle$. This gives, after dropping all angular brackets,

$$\frac{dR}{dN} = \frac{2R}{N} - \frac{3}{2} + \frac{S_1}{2N}. \tag{30}$$
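One way to see how Eq. (30) can follow from Eqs. (28) and (29) is to sum the single-node rates over all $N-1$ non-root nodes. The sketch below assumes the forms $N\,dg_x/dt = 3 - 4g_x$ for $g_x > 1$ and $N\,dg_x/dt = -2$ for $g_x = 1$. It also uses the fact that each step removes as many nodes as the degree of the target, which averages $2(N-1)/N \approx 2$ on a tree. Terms of relative order $1/N$ are dropped.

```latex
\begin{align*}
N\frac{dR}{dt} &= \sum_{x:\,g_x>1}(3-4g_x) \;-\; 2S_1
  = 3(N-1-S_1) - 4(R-S_1) - 2S_1 \\
 &= 3(N-1) - 4R - S_1 \;\approx\; 3N - 4R - S_1, \\
\frac{dN}{dt} &\approx -2, \\
\frac{dR}{dN} &= \frac{dR/dt}{dN/dt} \;\approx\; \frac{4R - 3N + S_1}{2N}
  = \frac{2R}{N} - \frac{3}{2} + \frac{S_1}{2N}.
\end{align*}
```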



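We define the average radius $r = R/N$, with initial value $r_0 = aN_0^{1/2}$ for large $N_0$, where the constant $a \sim O(1)$ depends on the precise rule for constructing critical trees. Equation (30) can be solved to get

$$r(N) = \frac{3}{2}\left(1 - \frac{N}{N_0}\right) + a\frac{N}{N_0^{1/2}} - \frac{N}{2}\int_N^{N_0}\frac{S_1(N')}{N'^3}\,dN'. \tag{31}$$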
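The closed-form solution can be cross-checked by integrating the mean-field equation numerically. The following sketch assumes that Eq. (30) has the form $dR/dN = 2R/N - 3/2 + S_1/(2N)$ and, purely for testing, uses a hypothetical constant profile $S_1(N) = c$. For that profile the integral in Eq. (31) evaluates to $(c/2)(1/N^2 - 1/N_0^2)$. The function names and parameter values are illustrative, not the paper's.

```python
import math


def integrate_R(n0, n_end, a, s1, steps=20000):
    """RK4 integration of dR/dN = 2R/N - 3/2 + s1(N)/(2N) from N = n0 down to n_end.

    The initial condition is R_0 = r_0 * N_0 with r_0 = a * N_0**0.5.
    """
    def rhs(n, big_r):
        return 2.0 * big_r / n - 1.5 + s1(n) / (2.0 * n)

    h = (n_end - n0) / steps  # negative step: N decreases under RSR
    n = float(n0)
    big_r = a * n0 ** 1.5
    for _ in range(steps):
        k1 = rhs(n, big_r)
        k2 = rhs(n + h / 2, big_r + h * k1 / 2)
        k3 = rhs(n + h / 2, big_r + h * k2 / 2)
        k4 = rhs(n + h, big_r + h * k3)
        big_r += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        n += h
    return big_r


def r_closed_form(n, n0, a, c):
    """Eq. (31) evaluated for the constant test profile S_1 = c (an assumption)."""
    return (1.5 * (1 - n / n0)
            + a * n / math.sqrt(n0)
            - (c * n / 4) * (1 / n ** 2 - 1 / n0 ** 2))


if __name__ == "__main__":
    n0, n_end, a, c = 10_000, 100, 1.0, 5.0
    r_numeric = integrate_R(n0, n_end, a, lambda n: c) / n_end
    print(r_numeric, r_closed_form(n_end, n0, a, c))
```

A constant $S_1$ is not realistic for renormalized trees; it is chosen only because it makes the integral elementary, so the two numbers can be compared directly.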
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FIG. 10: (Color online) FSS analysis for the distribution of 
last sizes based on Eqs. (33)-(36) with $\tau = 1.4 \pm 0.1$ and
$D = 0.25 \pm 0.07$. In view of the comment after Eq. (34),
$p(N_\ell)$ is replaced with $p(N_\ell)/2$ for $N_\ell = 2$.



Bounds on $r(N)$ can be placed based on the fact that $1 < S_1 < N$ to get

$$1 - \frac{N}{N_0} + a\frac{N}{N_0^{1/2}} \;<\; r \;<\; \frac{3}{2}\left(1 - \frac{N}{N_0}\right) + a\frac{N}{N_0^{1/2}} - \frac{1}{4N} + \frac{N}{4N_0^2}. \tag{32}$$

These bounds are tested against numerical data in Fig. 8, showing excellent agreement, up until the regime where $N$ becomes small compared to $N_0^{1/2}$. At that point mean-field theory breaks down. As the trees start to exit the mean-field regime, their average radius becomes order unity even for $N \sim N_0^{1/2} \to \infty$. Figure 9 shows the evolution of the average number of nodes at distances 1, 2, 3, and 4 from the root ($S_1$, $S_2$, $S_3$, and $S_4$, respectively). At $N = N_0^{1/2}$, $S_1$ becomes the largest shell, and $S_2$ seems to be exactly equal to $S_1$ at that point. All other shells vanish compared to $S_1$ for smaller $N$. This is the origin of the finite radius of renormalized trees near the end of the mean-field regime.



J. Distribution of last sizes and the star regime



Before the network reaches the trivial fixed point at $N = 1$ it must first turn into a star. The star eventually collapses into a single node when the central node is hit as the target.



We define the quantity $N_\ell$ to be the size of the network one step before it dies. Figure 10 shows an FSS plot for the probability distribution of $N_\ell$. More precisely, it shows $N_\ell^{\tau}\,p(N_\ell)$ against $N_\ell/N_0^{1/4}$. The data collapse seen suggests a scaling form

$$p(N_\ell) \sim \frac{1}{N_\ell^{\tau}}\,\Phi\!\left(N_\ell/N_0^{D}\right), \tag{33}$$

with $\tau = 1.4 \pm 0.1$, $D = 0.25 \pm 0.05$. The scaling function $\Phi(x)$ seems to approach a constant for $x \to 0$, suggesting that $p(N_\ell)$ tends to a power law, $p(N_\ell) \sim N_\ell^{-\tau}$, for $N_\ell \ll N_0^{1/4}$.
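The rescaling behind such a collapse plot is simple to write down. The sketch below uses synthetic, unnormalized data generated from an assumed scaling function $\Phi(x) = e^{-x}$, not the simulation data of the paper, and maps each $(N_\ell, p)$ pair to the collapse coordinates $x = N_\ell/N_0^{D}$ and $y = N_\ell^{\tau} p$. The helper names are hypothetical.

```python
import math


def rescale_for_collapse(sizes, probs, n0, tau, D):
    """Map (N_l, p(N_l)) to the collapse coordinates (N_l / N0**D, N_l**tau * p)."""
    x = [n / n0 ** D for n in sizes]
    y = [n ** tau * p for n, p in zip(sizes, probs)]
    return x, y


def synthetic_distribution(n0, tau, D, n_max):
    """Unnormalized test data p = N_l**(-tau) * exp(-N_l / N0**D); illustrative only."""
    sizes = list(range(2, n_max + 1))
    probs = [n ** (-tau) * math.exp(-n / n0 ** D) for n in sizes]
    return sizes, probs


if __name__ == "__main__":
    tau, D = 1.4, 0.25
    for n0 in (10 ** 4, 10 ** 6):
        sizes, probs = synthetic_distribution(n0, tau, D, 50)
        x, y = rescale_for_collapse(sizes, probs, n0, tau, D)
        # With the correct exponents, curves for different N0 fall on y = exp(-x).
        print(n0, max(abs(yi - math.exp(-xi)) for xi, yi in zip(x, y)))
```

With measured histograms, $\tau$ and $D$ would be varied until the curves for different $N_0$ overlap; the quoted uncertainties reflect the range over which the collapse remains acceptable.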



From the distribution of $N_\ell$ we can determine the distribution of sizes when the tree first turns into a star. Let us call $s$ the size when the renormalized tree first reaches a star configuration, and $p_s(s)$ its distribution. In each subsequent time step the star can either shrink by exactly one node (probability $1 - 1/N$), or it can be reduced immediately to a single node (probability $1/N$). Starting with a star of size $s$, the conditional probability to end up at final size $N_\ell$ is

$$p(N_\ell|s) = \begin{cases} \dfrac{1}{s}, & N_\ell = s, \\[2mm] \dfrac{1}{N_\ell}\displaystyle\prod_{t=0}^{s-N_\ell-1}\left(1 - \dfrac{1}{s-t}\right) = \dfrac{1}{s}, & 2 < N_\ell < s, \\[2mm] \dfrac{2}{s}, & N_\ell = 2, \end{cases} \tag{34}$$

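where the last line comes from the degeneracy of a star with two nodes and is required for proper normalization. Assuming that $p_s(s)$ has a scaling form with possibly new exponents and a new scaling function $\phi$,

$$p_s(s) \sim \frac{1}{s^{\alpha}}\,\phi\!\left(s/N_0^{\beta}\right), \tag{35}$$

we obtain

$$p(N_\ell) = \sum_{s \ge N_\ell} p(N_\ell|s)\,p_s(s) \approx \int_{N_\ell}^{\infty} ds\,\frac{1}{s^{1+\alpha}}\,\phi\!\left(s/N_0^{\beta}\right) = \frac{1}{N_\ell^{\alpha}}\,\Psi\!\left(N_\ell/N_0^{\beta}\right), \tag{36}$$

with $\Psi(x) = x^{\alpha}\int_x^{\infty} dx'\,\phi(x')/x'^{1+\alpha}$. This agrees with Eq. (33), if we identify $\alpha = \tau$, $\beta = D$, and $\Psi(x) = \Phi(x)$. Thus the distributions of $s$ and of $N_\ell$ have the same exponents, if they obey FSS, which we verified numerically.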
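The conditional probability of the last size given the initial star size can be verified with exact rational arithmetic. The sketch below follows the star dynamics described in the text: at size $n$ the hub is hit with probability $1/n$, which ends the process with last size $n$, and otherwise a leaf is hit and the star shrinks by one. A two-node star always collapses in the next step. The function name is illustrative.

```python
from fractions import Fraction


def last_size_distribution(s):
    """Exact distribution of the last size N_l for a star that starts with s >= 2 nodes."""
    dist = {}
    reach = Fraction(1)  # probability that the star survives down to size n
    for n in range(s, 2, -1):
        dist[n] = reach * Fraction(1, n)   # hub is hit at size n
        reach *= Fraction(n - 1, n)        # a leaf is hit: star shrinks to n - 1
    dist[2] = reach  # two-node star: either node acts as the hub
    return dist


if __name__ == "__main__":
    for n_last, prob in sorted(last_size_distribution(7).items()):
        print(n_last, prob)
```

Every size $2 < N_\ell \le s$ comes out with probability $1/s$ and $N_\ell = 2$ with probability $2/s$, so the distribution sums to one, which is the normalization property mentioned for the two-node case.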



IV. CONCLUSION 

To study invariant properties of graphs under coarse 
graining, we have introduced the random sequential 
renormalization (RSR) method, where in each step only 
a part of the network within a fixed distance b from a 
randomly chosen node collapses into one node. RSR is 
easy to implement and eliminates the problem of finding 
an optimum tiling of the network. In addition, the small 
effect of each decimation gives a much more detailed sta- 
tistical picture of the renormalization flow. We applied 
the RSR with b = 1 to critical trees and derived results 
analytically, finding good agreement with numerical sim- 
ulations. 
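A minimal sketch of RSR with $b = 1$ is given below, assuming the network is stored as a dictionary that maps each node to the set of its neighbors. In each step a uniformly chosen target absorbs all of its neighbors and inherits their outside links. The starting tree here is a random recursive tree, used only as a convenient stand-in; it is not the critical-tree ensemble studied in the paper. All function names are hypothetical.

```python
import random


def random_tree(n, rng):
    """Random recursive tree on n nodes (a stand-in, NOT the paper's critical trees)."""
    adj = {0: set()}
    for i in range(1, n):
        j = rng.randrange(i)  # attach node i to a uniformly chosen earlier node
        adj[i] = {j}
        adj[j].add(i)
    return adj


def rsr_step(adj, rng):
    """One RSR step with b = 1: a random target absorbs all of its neighbors."""
    target = rng.choice(list(adj))
    absorbed = set(adj[target])
    merged = absorbed | {target}
    outer = set()
    for v in absorbed:
        outer |= adj[v] - merged      # links leaving the merged group
    for v in absorbed:
        del adj[v]
    for w in outer:
        adj[w] -= absorbed            # drop links to absorbed nodes
        adj[w].add(target)            # and reconnect to the super-node
    adj[target] = outer
    return target


def run_rsr(adj, rng):
    """Apply RSR steps until one node is left; record (N, k_max) after each step."""
    history = []
    while len(adj) > 1:
        rsr_step(adj, rng)
        k_max = max(len(nbrs) for nbrs in adj.values())
        history.append((len(adj), k_max))
    return history


if __name__ == "__main__":
    rng = random.Random(1)
    for n, k_max in run_rsr(random_tree(1000, rng), rng):
        print(n, k_max)
```

Because each step contracts a connected group of nodes, a tree remains a tree throughout the flow, and the recorded maximum degree can be used to follow the emergence of hubs in a single realization.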

Under renormalization a critical regime appears when the size of the tree $N \sim N_0^{\nu}$ with $\nu = 1/2$. The behavior of the tree before this regime is reached is described using a mean-field theory based on generating functions. There is a constant $c \sim 1$ such that the degree distribution of the network is scale free, $p_k \sim k^{-\gamma}$ with $\gamma = 2$, in the limit $N_0 \to \infty$ and $N/N_0^{1/2} = c$. Both the variance of the degree distribution $\sigma^2$ and the maximum degree in the network $k_{\max}$ diverge as $N_0^{1/2}$ in this limit. Both of these quantities are described by crossover functions exhibiting finite-size scaling that connect the mean-field regime to a regime for $N_0^{1/4} < N < N_0^{1/2}$ when hubs start to emerge. Results from numerical simulations agree with a scaling theory we develop to describe this fixed point. Trees are short and fat near this point with an average depth $O(1)$. As RSR proceeds further, star configurations start to appear for $N \sim N_0^{\nu_{\rm star}}$ with $\nu_{\rm star} \approx 1/4$. The distribution of star sizes seems to obey FSS, characterized by its own critical exponents, which we were not able to derive analytically.

We began this investigation to study in a more con- 
trolled way claims made in the literature about real-space 
renormalization of complex networks [5, 14, 15]. In the
most detailed previous study [T3J [H] many of the find- 
ings are similar to ours, with the caveat that unlike pre- 
vious works, the results presented here are for critical 
trees rather than for general networks. The most strik- 
ing and robust agreement is the emergence of hubs under 
renormalization - which leads to a final star regime. As- 
sociated with the emergence of hubs is a fixed point that 
gives rise to a power law degree distribution. 

An alternative way to describe RSR is the following: Instead of removing nodes in each coarse graining step and replacing them by a new "super"-node, we keep them and join them into a cluster. At each subsequent RSR step, entire clusters are joined into new "superclusters." This process, in which clusters grow by attaching to all their neighbors, is an aggregation process called "agglomerative percolation" (AP) in Ref. [33]. The original network has only clusters of size one, but larger and larger clusters appear as the RG flow goes on. At the critical point, an infinite cluster (in the limit $N_0 \to \infty$) appears. In this interpretation, the critical behavior seen
in this paper (and in Refs. [El [15]) is just a novel type 
of percolation. 

If the original network is a simple chain, the probability distribution to find any sequence of masses for any $b$, initial size $N_0$, and time $t$ has been derived exactly. In this case, AP exhibits critical exponents different from ordinary percolation. These exponents depend on $b$ [23]. In two dimensions on a square lattice, AP is in a different universality class than ordinary percolation [24].

In future work [35] we plan to study RSR on networks that are more complex than trees. For Erdős-Rényi graphs we have found a fixed point at finite ratio $N/N_0$ associated with the emergence of hubs, which in the case of critical trees and of simple chains is driven to zero. This difference between trees and Erdős-Rényi graphs is intuitively most easily understood in the percolation picture discussed above. Since trees have topological dimension one, any percolation transition on them can happen only when the probability for establishing bonds goes to one.

It remains to be seen whether RSR (or equivalently 
AP) can be used as a generic tool to uncover universal- 
ity classes in large networks (in the usual renormalization 
group sense) by eliminating irrelevant degrees of freedom. 
On a more speculative note, our results point to another 
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way to create scale free networks that is not based on 
an explicit generative mechanism for power law behavior
at the microscopic scale, but results from hubs being
aggregates of many microscopic nodes. That would sug- 
gest the view that networks are emergent collections of 
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